NSF PAR Search | NSF Public Access Repository

Differentiable GPU-Parallelized Task and Motion Planning

Shen, William; Garrett, Caelan; Kumar, Nishanth; Goyal, Ankit; Hermans, Tucker; Lozano-Perez, Tomas; Ramos, Fabio (June 2025, Robotics science and systems)

Planning long-horizon robot manipulation requires making discrete decisions about which objects to interact with and continuous decisions about how to interact with them. A robot planner must select grasps, placements, and motions that are feasible and safe. This class of problems falls under Task and Motion Planning (TAMP) and poses significant computational challenges in terms of algorithm runtime and solution quality, particularly when the solution space is highly constrained. To address these challenges, we propose a new bilevel TAMP algorithm that leverages GPU parallelism to efficiently explore thousands of candidate continuous solutions simultaneously. Our approach uses GPU parallelism to sample an initial batch of solution seeds for a plan skeleton and to apply differentiable optimization on this batch to satisfy plan constraints and minimize solution cost with respect to soft objectives. We demonstrate that our algorithm can effectively solve highly constrained problems with non-convex constraints in just seconds, substantially outperforming serial TAMP approaches, and validate our approach on multiple real-world robots.
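The batch-then-optimize idea in this abstract can be illustrated with a toy sketch. This is not the authors' algorithm or code: it assumes a hypothetical 2D problem (reach a goal point while staying outside a circular obstacle), uses NumPy array operations on the CPU as a stand-in for GPU batching, and uses a hand-derived gradient of a quadratic penalty in place of automatic differentiation. The function name `solve_batch` and all parameter values are invented for illustration.

```python
import numpy as np

def solve_batch(num_seeds=1024, steps=200, lr=0.05, weight=10.0, seed=0):
    """Toy batched optimization: all seeds are updated together as one array."""
    rng = np.random.default_rng(seed)
    goal = np.array([2.0, 0.0])      # soft objective: end close to this point
    center = np.array([1.0, 0.0])    # hypothetical circular obstacle
    radius = 0.5                     # constraint: stay outside the circle (non-convex)

    # Sample an initial batch of candidate solutions (the "seeds").
    x = rng.uniform(-3.0, 3.0, size=(num_seeds, 2))

    for _ in range(steps):
        diff = x - center
        dist = np.linalg.norm(diff, axis=1, keepdims=True)
        violation = np.maximum(0.0, radius - dist)   # > 0 only inside the obstacle
        # Gradient of weight * violation**2 with respect to x.
        grad_penalty = -2.0 * weight * violation * diff / np.maximum(dist, 1e-9)
        # Gradient of the soft cost ||x - goal||**2.
        grad_cost = 2.0 * (x - goal)
        x = x - lr * (grad_cost + grad_penalty)

    # Keep only candidates that satisfy the constraint, then pick the cheapest.
    feasible = np.linalg.norm(x - center, axis=1) >= radius - 1e-6
    cost = np.sum((x - goal) ** 2, axis=1)
    best = int(np.argmin(np.where(feasible, cost, np.inf)))
    return x[best], cost[best], feasible

best_x, best_cost, feasible = solve_batch()
print("best candidate:", best_x, "cost:", best_cost)
print("feasible seeds:", int(feasible.sum()), "of", feasible.size)
```

Because the constraint is non-convex, an individual seed can stall against the obstacle; optimizing many seeds at once and filtering for feasibility afterwards is what makes the batch useful in this sketch.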
Full Text Available
Coupled Iterative Refinement for 6D Multi-Object Pose Estimation

https://doi.org/10.1109/CVPR52688.2022.00661

Lipson, Lahav; Teed, Zachary; Goyal, Ankit; Deng, Jia (January 2022, IEEE Conference on Computer Vision and Pattern Recognition (CVPR))

Full Text Available
Infinite Photorealistic Worlds Using Procedural Generation

https://doi.org/10.1109/CVPR52729.2023.01215

Raistrick, Alexander; Lipson, Lahav; Ma, Zeyu; Mei, Lingjie; Wang, Mingzhe; Zuo, Yiming; Kayan, Karhan; Wen, Hongyu; Han, Beining; Wang, Yihan; et al (June 2023, IEEE)
PackIt: A Virtual Environment for Geometric Planning

Goyal, Ankit; Deng, Jia (July 2020, International Conference on Machine Learning (ICML))
Full Text Available
PackIt: A Virtual Environment for Geometric Planning

Goyal, Ankit; Deng, Jia (January 2020, International Conference on Machine Learning (ICML))
The ability to jointly understand the geometry of objects and plan actions for manipulating them is crucial for intelligent agents. We refer to this ability as geometric planning. Recently, many interactive environments have been proposed to evaluate intelligent agents on various skills; however, none of them caters to the needs of geometric planning. We present PackIt, a virtual environment to evaluate and potentially learn the ability to do geometric planning, where an agent needs to take a sequence of actions to pack a set of objects into a box with limited space. We also construct a set of challenging packing tasks using an evolutionary algorithm. Further, we study various baselines for the task that include model-free learning-based and heuristic-based methods, as well as search-based optimization methods that assume access to the model of the environment. Code and data are available at this https URL.
Full Text Available
Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D

Goyal, Ankit; Yang, Kaiyu; Yang, Dawei; Deng, Jia (January 2020, Advances in Neural Information Processing Systems 33 pre-proceedings (NeurIPS))
Understanding spatial relations (e.g., laptop on table) in visual input is important for both humans and robots. Existing datasets are insufficient as they lack large-scale, high-quality 3D ground truth information, which is critical for learning spatial relations. In this paper, we fill this gap by constructing Rel3D: the first large-scale, human-annotated dataset for grounding spatial relations in 3D. Rel3D enables quantifying the effectiveness of 3D information in predicting spatial relations on large-scale human data. Moreover, we propose minimally contrastive data collection, a novel crowdsourcing method for reducing dataset bias. The 3D scenes in our dataset come in minimally contrastive pairs: two scenes in a pair are almost identical, but a spatial relation holds in one and fails in the other. We empirically validate that minimally contrastive examples can diagnose issues with current relation detection models as well as lead to sample-efficient training. Code and data are available at https://github.com/princeton-vl/Rel3D.
Full Text Available
Think Visually: Question Answering through Virtual Imagery

Goyal, Ankit; Wang, Jian; Deng, Jia (July 2018, Proceedings of the conference - Association for Computational Linguistics. Meeting)

Full Text Available
Robotics, Automation, and Control

https://doi.org/10.1061/9780784482438.034

Kim, Daeho; Goyal, Ankit; Newell, Alejandro; Lee, SangHyun; Deng, Jia; Kamat, Vineet R. (June 2019, Semantic Relation Detection Between Construction Entities to Support Safe Human-Robot Collaboration in Construction)

Construction robots have drawn increased attention as a potential means of improving construction safety and productivity. However, it is still challenging to ensure safe human-robot collaboration in dynamic and unstructured construction workspaces. On construction sites, multiple entities dynamically collaborate with each other and the situational context between them evolves continually. Construction robots must therefore be equipped to visually understand the scene’s contexts (i.e., semantic relations to surrounding entities), thereby safely collaborating with humans, as a human vision system does. Toward this end, this study builds a unique deep neural network architecture and develops a construction-specialized model by experimenting with multiple fine-tuning scenarios. Also, this study evaluates its performance on real construction operations data in order to examine its potential toward real-world applications. The results showed the promising performance of the tuned model: the recall@5 on the training and validation datasets reached 92% and 67%, respectively. The proposed method, which supports construction co-robots with holistic scene understanding, is expected to contribute to promoting safer human-robot collaboration in construction.
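For readers unfamiliar with the recall@5 metric quoted in this abstract, the sketch below shows one common way such a score is computed. The abstract does not give the study's exact definition, so this assumes the usual reading in relation detection: the fraction of ground-truth relations that appear among the top k ranked predictions for each image, pooled over all images. The function name `recall_at_k` and the example relation labels are hypothetical.

```python
def recall_at_k(ranked_predictions, ground_truth, k=5):
    """Fraction of ground-truth relations found in the top-k predictions.

    ranked_predictions: one list per image, ordered from most to least confident.
    ground_truth: one set of true relations per image.
    """
    hits = 0
    total = 0
    for preds, truth in zip(ranked_predictions, ground_truth):
        top_k = set(preds[:k])                        # only the k best guesses count
        hits += sum(1 for relation in truth if relation in top_k)
        total += len(truth)
    return hits / total if total else 0.0

# Hypothetical predictions for two images (labels invented for illustration).
predictions = [
    ["worker-guides-load", "worker-near-truck", "crane-lifts-load",
     "truck-behind-crane", "worker-on-scaffold", "worker-operates-crane"],
    ["excavator-digs-soil", "worker-near-excavator"],
]
truth = [
    {"worker-guides-load", "worker-operates-crane"},  # second one is ranked 6th: missed
    {"excavator-digs-soil"},
]
score = recall_at_k(predictions, truth, k=5)
print(f"recall@5 = {score:.2f}")
```

Under this definition, a gap between training and validation scores such as the one reported means that fewer of the true relations reach the top five on unseen data.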
Full Text Available